Abstract

Data replication and transaction deadlocks can severely affect
the performance of distributed database systems. Many
current evaluation techniques ignore these aspects because they
are difficult to evaluate through analysis and time-consuming to
evaluate through simulation. In this paper, we use a technique
that combines simulation and analysis to closely illustrate the
impact of deadlocks and to evaluate the performance of replicated
distributed databases with both shared and exclusive locks.

1. Introduction

A distributed database system (DDS) is a collection of co- 
operating nodes each containing a set of data items. A user 
transaction can enter such a system at any of these nodes. The 
receiving node, often referred to as the coordinating node, un- 
dertakes the task of locating the nodes that contain the data 
items required by a transaction. 

In order to maintain database consistency and correctness
in the presence of concurrent transactions, several concurrency
control protocols have been proposed [1]. Of these, the most
commonly used are time-stamping and locking protocols. Locking
protocols have been widely used in both commercial and
research environments. In static locking, prior to the start of
execution, a transaction needs to acquire either a shared lock (for
read operations) or an exclusive lock (for update operations) on
each of the relevant data items.

Data replication is used to improve the performance of local
transactions and the availability of databases. In replicated
databases, one data item may have more than one copy in
the system. Replica control algorithms are used to maintain
consistency among these copies. One of these is the
read-one/write-all protocol. With this protocol, an exclusive lock
request must acquire an exclusive lock on every copy of the data
item. For a shared lock request to succeed, any one copy of the
data item has to be share-locked. When transactions with conflicting
lock requests are initiated concurrently, they may become
blocked in a deadlock.
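
As a minimal sketch of the read-one/write-all rule described above, the following Python fragment lists the lock requests a transaction would issue. The function name `rowa_lock_requests`, the `placement` mapping from data items to the nodes holding their copies, and the choice of the first listed copy for reads are illustrative assumptions, not part of the protocol definition.

```python
from typing import Dict, List, Tuple

# Hypothetical representation: (item, node, mode), mode is "S" or "X".
LockRequest = Tuple[str, int, str]


def rowa_lock_requests(
    read_set: List[str],
    write_set: List[str],
    placement: Dict[str, List[int]],
) -> List[LockRequest]:
    """Return the lock requests implied by read-one/write-all."""
    requests: List[LockRequest] = []
    for item in read_set:
        # Read-one: a shared lock on any single copy suffices.
        # Assumption: we simply pick the first listed copy.
        requests.append((item, placement[item][0], "S"))
    for item in write_set:
        # Write-all: an exclusive lock is needed on every copy.
        for node in placement[item]:
            requests.append((item, node, "X"))
    return requests


# Item "x" is stored at nodes 0 and 2, item "y" at nodes 1 and 3.
placement = {"x": [0, 2], "y": [1, 3]}
print(rowa_lock_requests(["x"], ["y"], placement))
# prints [('x', 0, 'S'), ('y', 1, 'X'), ('y', 3, 'X')]
```

A read therefore costs one lock regardless of the number of copies, while a write costs one lock per copy, which is why conflicts become more likely as replication increases.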

There are two major ways to evaluate the performance of
distributed systems: simulation and analysis. Simulation is a
conceptually tractable technique, but it requires a large amount
of computation time. Analysis, on the other hand, is computationally
faster but may not be tractable for all problems. In [4], Shyu and Li
proposed an elegant analysis model to evaluate the response
time and throughput of transactions in a non-replicated DDS.
Assuming exclusive locking (i.e., only write operations), they
model the queue of lock requests at an object as an M/M/1
queue [3]. This results in a closed form for the waiting-time
distribution at a node, expressed in terms of the average rate
of arrival of requests and the average lock-holding time. With
shared locks and replication added to the picture, it is very
difficult to obtain a closed-form model. Because of these
limitations of simulation and analysis, we develop a technique that
combines the two.
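
For reference, the M/M/1 queue mentioned above has the following standard waiting-time results. The symbols are illustrative and are not the notation of [4]: $\alpha$ denotes the average arrival rate of lock requests at an object, $h$ the average lock-holding time, and $\rho = \alpha h < 1$ the utilization. The waiting time $W$ of a request before it is granted the lock then satisfies

$$ P(W > x) = \rho \, e^{-(1/h - \alpha)\,x}, \qquad x \ge 0, $$

$$ E[W] = \frac{\rho\, h}{1 - \rho}. $$

For example, with $\alpha = 2$ requests per second and $h = 0.25$ seconds, $\rho = 0.5$ and $E[W] = 0.25$ seconds. These expressions rely on exponential holding times and one lock holder at a time, assumptions that no longer hold directly once shared locks can be held concurrently.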

This paper is organized as follows. In Section 2, we describe
the model used in our performance evaluation. In Section
3, we propose an evaluation technique. In Section 4, we
present the results. Finally, Section 5 gives the conclusions.


2. Model 

Our model has the following parameters: 

• There are n nodes. 

• There are d data items in a DDS.

• Each data item is located at exactly c nodes.
The dc data copies are uniformly distributed across the n
nodes.

• Each transaction accesses k data items. 

• r is the read ratio. Among the k data items to be accessed,
rk are accessed only for read operations, and the rest
are for read-write operations. Due to the read-one/write-all
replica control policy, a transaction must procure rk
shared locks for the rk read operations and (1 - r)kc exclusive
locks for the (1 - r)k read-write operations.

• Each data item is equally likely to be accessed by a trans- 
action. 

• Transaction arrivals into the system form a Poisson process
with rate λ.

• The communication delay between any two nodes is
exponentially distributed with mean t.

• The average execution time of a transaction, once the 
locks are obtained, is 3. 

• The deadlock detection mechanism is invoked every τ seconds.

• After a transaction is aborted, it takes an average of
ω seconds for the transaction to be restarted.

• μ is the service rate of transactions.

• b is the lock-holding time. 

• λc is the arrival rate at each data copy.


3. Performance Evaluation Technique

Our technique consists of two stages. In the first stage, the
average transaction response time and throughput are calculated
with deadlocks ignored. This is an iterative step involving
simulation and analysis. In the second stage, the probabilities
of transaction conflicts and deadlocks are computed using
probability models. These probabilities are used, in turn, to
compute the response time and throughput in the presence of
deadlocks.

Stage 1: 

Initially, we assume that there are no lock conflicts between
transactions. Each transaction has to procure rk shared locks
on data copies and (1 - r)kc exclusive locks on data copies.
When a transaction has been granted all of these locks, it can
go ahead with execution.

This procedure is summarized in the following 6 steps.

1. Initialize the lock-holding time (b) to 1/μ.

2. Given the total transaction arrival rate (λ), the shared
lock ratio (r), the number of data items (d), the number of
data items required by each transaction (k), and the number
of replications (c), derive the arrival rate at each data
copy (λc).

3. With the arrival rate at each data copy (λc), the average
lock-holding time (b), and the transmission time (t), simulate
the queue at a data copy to obtain the wait-time (w)
distribution. From this distribution, calculate the
response time of transactions.

4. With the average service time of transactions (1/μ) and
the transmission time, derive a new lock-holding
time (b').

5. Set b to this new lock-holding time b'.

6. If the old and new lock-holding times are sufficiently close,
stop the iteration. Otherwise, go back to step 3.

At the end of stage 1 the response time without the considera- 
tion of transaction deadlocks is obtained. 
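
The iteration of Stage 1 can be sketched as a fixed-point loop around a small queue simulation. The sketch below is illustrative only and rests on assumptions that are not stated in the text: the per-copy arrival rate spreads the rk + (1 - r)kc lock requests of each transaction uniformly over the dc copies, the simulated queue treats every request as exclusive, and the update rule for the new lock-holding time (service time plus a round trip plus the mean wait) is a placeholder. All function names are hypothetical.

```python
import random


def per_copy_arrival_rate(lam, r, d, k, c):
    # Assumed derivation: r*k shared plus (1-r)*k*c exclusive requests
    # per transaction, spread uniformly over the d*c data copies.
    return lam * (r * k + (1 - r) * k * c) / (d * c)


def simulate_mean_wait(lam_c, b, n_requests=20000, seed=0):
    # Lindley recursion for a FIFO single-server queue at one data copy.
    # Simplification: every request is treated as exclusive, with
    # exponentially distributed holding times of mean b.
    rng = random.Random(seed)
    wait, total = 0.0, 0.0
    for _ in range(n_requests):
        total += wait
        holding = rng.expovariate(1.0 / b)
        gap = rng.expovariate(lam_c)
        wait = max(0.0, wait + holding - gap)
    return total / n_requests


def stage1(lam, r, d, k, c, mu, t, tol=1e-3, max_iter=100):
    b = 1.0 / mu                                     # step 1
    lam_c = per_copy_arrival_rate(lam, r, d, k, c)   # step 2
    for iteration in range(1, max_iter + 1):
        w = simulate_mean_wait(lam_c, b)             # step 3
        # Step 4 (placeholder rule, an assumption of this sketch):
        # execution time + one round trip + mean wait for other locks.
        b_new = 1.0 / mu + 2.0 * t + w
        if abs(b_new - b) < tol:                     # step 6
            return b_new, w, iteration
        b = b_new                                    # step 5
    raise RuntimeError("lock-holding time did not converge")


b, w, iterations = stage1(lam=1.0, r=0.5, d=100, k=4, c=2, mu=10.0, t=0.01)
print(b, w, iterations)
```

Fixing the random seed makes each pass a deterministic function of b, so the loop behaves like an ordinary fixed-point iteration. A shared/exclusive queue discipline would replace `simulate_mean_wait` without changing the loop structure.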

Stage 2: 

This stage considers transaction conflicts. The probabilities of
transaction deadlock and restart are computed and then used to
compute the response time and throughput in the presence of
deadlocks.

Assume there are two transactions T1 and T2. Let RS and WS
be the read set and write set of a transaction, respectively.

1. Let fs_i be the probability that the readset of T1 has i data
items overlapping with the writeset of T2, i.e.,
|RS(T1) ∩ WS(T2)| = i.

2. Let fe_ij be the probability that, given |RS(T1) ∩ WS(T2)| = i,
the writeset of T1 has j data items overlapping with
the readset and writeset of T2, i.e., the probability that
|WS(T1) ∩ (RS(T2) ∪ WS(T2))| = j.


Clearly,

$$ fs_i = \frac{\binom{(1-r)k}{i}\,\binom{d-(1-r)k}{rk-i}}{\binom{d}{rk}} \tag{1} $$

It can also be noted that fs_i fe_ij is the probability that

$$ |RS(T1) \cap WS(T2)| = i \;\wedge\; |WS(T1) \cap (WS(T2) \cup RS(T2))| = j \tag{2} $$

If PW_ij is the probability that T1 waits for T2 given these overlaps, then

$$ PW_{ij} = p_1 + p_2 - p_1 p_2 \tag{3} $$

$$ p_1 = 1 - \left[1 - (1/2)^{c}\right]^{i} \tag{4} $$

$$ p_2 = 1 - (1/2)^{cj} \tag{5} $$



where p1 is the probability that T1 waits for T2 for shared locks
in the readset, and p2 is the probability that T1 waits for T2 for
exclusive locks in the writeset.

The probability that T1 waits for T2 is now given by

$$ PW = \sum_{i=0}^{\min(rk,\,k-rk)} \; \sum_{j=0}^{\min(k-rk,\,k-i)} fs_i \, fe_{ij} \, PW_{ij} $$
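
The double sum for the waiting probability can be evaluated numerically along the following lines. This is a sketch under stated assumptions rather than the exact formulas of the paper: `fs` is the hypergeometric overlap probability implied by uniform access, `fe` is an assumed hypergeometric form that ignores any overlap between the two readsets, and `pw_ij` assumes p1 = 1 - [1 - (1/2)^c]^i and p2 = 1 - (1/2)^(cj). The integer arguments `kr` and `kw` (the number of items read and written, standing for rk and (1 - r)k) and all function names are hypothetical.

```python
from math import comb


def fs(i, d, kr, kw):
    """P(|RS(T1) ∩ WS(T2)| = i): kr read items overlapping kw written
    items, all drawn uniformly from d data items (hypergeometric)."""
    return comb(kw, i) * comb(d - kw, kr - i) / comb(d, kr)


def fe(i, j, d, kr, kw):
    """Assumed form of P(|WS(T1) ∩ (RS(T2) ∪ WS(T2))| = j | overlap i).
    Simplification: T1 writes kw of the d - kr items outside its readset,
    of which k - i belong to T2 (read-read overlap is ignored).
    Requires d >= kr + k."""
    k = kr + kw
    return comb(k - i, j) * comb(d - kr - (k - i), kw - j) / comb(d - kr, kw)


def pw_ij(i, j, c):
    p1 = 1 - (1 - 0.5 ** c) ** i   # T1 waits for T2 on shared locks
    p2 = 1 - 0.5 ** (c * j)        # T1 waits for T2 on exclusive locks
    return p1 + p2 - p1 * p2


def wait_probability(d, kr, kw, c):
    """Sum fs_i * fe_ij * PW_ij over all feasible overlaps (i, j)."""
    k = kr + kw
    total = 0.0
    for i in range(min(kr, kw) + 1):
        for j in range(min(kw, k - i) + 1):
            total += fs(i, d, kr, kw) * fe(i, j, d, kr, kw) * pw_ij(i, j, c)
    return total


# Example: d = 20 items, k = 4 with read ratio 1/2, c = 2 copies per item.
print(wait_probability(d=20, kr=2, kw=2, c=2))
```

A quick sanity check is that the values of `fs` over all i, and the values of `fe` over all j for a fixed i, each sum to 1, so the result is a weighted average of the PW_ij values.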

With this probability of waiting and the formulas in [4], we can
calculate the probability of a transaction deadlock, the
probability of a transaction restart, and the probability that a
transaction is blocked by other transactions. With these
probabilities and the time between deadlock detections (τ), we
can calculate the response time with deadlocks taken into account.
(Details are omitted here.)


4. Results 

Using this technique, we obtained a number of interesting
results that illustrate the effect of deadlocks and of the number
of replications on database performance. These are summarized
in Figures 1-5. We make the following observations.

• Transaction response times are quite sensitive to the ratio
of shared locks (Figures 1 and 2). Here, we compare the
response times when deadlocks are ignored (DI, computed in
Stage 1) with those obtained when deadlocks are considered
(DC, computed in Stage 2). The effect of deadlocks
is more pronounced at higher transaction loads and with
smaller values of r. When r = 2/3, the effect of deadlocks
on response time is not significant.

• Comparing Figures 1 and 2 with Figures 3 and 4, it can
be observed that increasing the number of replications results
in a larger response time when the read ratio is smaller than 1/3.

• Figure 5 shows the response times for different numbers of
replications. For read ratios of both 2/3 and 1/3, the response
time increases as the number of replications increases. However,
with a read ratio of 1/3, the rate of increase is much smaller
than with a read ratio of 2/3.

5. Conclusions 

In [4], Shyu and Li presented an elegant technique to evaluate
the performance of distributed database systems in the
presence of deadlocks. Their technique assumed only exclusive
locks and thus represented the worst-case effects of deadlocks.
In this paper, we have extended their technique to combine
simulation and analysis. With this extended technique, we allow
both shared and exclusive locking, as well as replication, in our
model. We evaluated the effect of the number of data items, the
number of data items accessed by each transaction, the ratio of
read operations, and the number of replications on transaction
response time. These results show the importance of considering
both shared and exclusive lock requests, the deadlock
probabilities, and the number of replications of the database in
response time evaluations.

Figure 2. Comparison of response time with different
read ratios when deadlock is considered.

Figure 4. Comparison of response time with different
read ratios when deadlock is ignored.

Figure 5. Comparison of response time with different
read ratios with and without deadlock.
DC: Deadlock Considered. DI: Deadlock Ignored.